An abelian square is a string of length 2n where the last n symbols form a permutation
of the first n symbols. In this note we count the number of abelian squares and
give an asymptotic estimate of this quantity.

1 Introduction

An abelian square of length $2n$ is a string of the form $xx'$, where $|x| = |x'| = n > 0$ and $x'$ is a
permutation of $x$. Two abelian squares in English are reappear and intestines. Of course,
the permutation can be the identity, so ordinary squares such as murmur and hotshots are
also considered to be abelian squares.

Abelian squares were introduced by Erdős [3, p. 240] and since then have been extensively
studied in the combinatorics on words literature (see, for example, [1, p. 37]). In this note 
we discuss enumerating the abelian squares over an alphabet of size k. 



2 Preliminaries 

Let $f_k(n)$ be the number of abelian squares of length $2n$ over an alphabet $\Sigma$ with $k$ letters.
Without loss of generality, we assume that $\Sigma = \{1, 2, \ldots, k\}$.

Given a string $x$ with $|x| = n$, the signature of $x$ is defined to be the vector enumerating
the number of 1's, 2's, etc. in $x$. (In computer science, this vector is sometimes called the
Parikh vector.) For example, the signature of 213313 is $(2, 1, 3)$. Hence a string $xx'$ is an
abelian square iff the signature of $x$ equals the signature of $x'$.
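
The signature test can be sketched in a few lines of Python. The function names `signature` and `is_abelian_square` are illustrative choices, not notation from this note, and the alphabet is simply taken to be the set of symbols occurring in the word.

```python
def signature(x, alphabet):
    """Count the occurrences in x of each symbol of alphabet, in alphabet order."""
    return tuple(x.count(a) for a in alphabet)

def is_abelian_square(w):
    """True if w = x x' with |x| = |x'| > 0 and x' a permutation of x."""
    n, remainder = divmod(len(w), 2)
    if n == 0 or remainder != 0:
        return False
    alphabet = sorted(set(w))  # assumption: use exactly the symbols that occur in w
    return signature(w[:n], alphabet) == signature(w[n:], alphabet)

print(signature("213313", "123"))
for word in ["reappear", "intestines", "murmur", "hotshots", "abelian"]:
    print(word, is_abelian_square(word))
```

Comparing signatures avoids searching for the permutation itself, so the check runs in time linear in the word length for a fixed alphabet.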

The following table enumerates $f_k(n)$ for the first few values of $k$ and $n$.
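
Small values of $f_k(n)$ can be regenerated by direct enumeration. The sketch below is one possible way to do so and is not taken from this note: it lists every first half $x$ over $\{1, \ldots, k\}$, groups the halves by signature, and pairs each $x$ with every $x'$ of the same signature. The helper name `count_abelian_squares` is hypothetical.

```python
from itertools import product

def count_abelian_squares(k, n):
    """Count abelian squares of length 2n over {1, ..., k} by enumerating all first halves."""
    tally = {}
    for x in product(range(1, k + 1), repeat=n):
        key = tuple(sorted(x))  # two halves share a signature iff their sorted forms agree
        tally[key] = tally.get(key, 0) + 1
    # A class of c halves with a common signature contributes c * c pairs (x, x').
    return sum(c * c for c in tally.values())

for k in range(1, 5):
    print(k, [count_abelian_squares(k, n) for n in range(1, 6)])
```

The running time grows like $k^n$, so this is only suitable for the first few rows and columns.
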
Examination of this table suggests that $f_2(n) = \binom{2n}{n}$, and indeed, this can be proved as
follows. Suppose we choose the positions of the 1's in the first $n$ symbols; if there are $i$ of
them, this can be done in $\binom{n}{i}$ ways. Once we choose these, the remaining symbols of the
first $n$ must be 2's. The last $n$ symbols must have the same signature as the first $n$, and this
can be done in $\binom{n}{i}$ ways. So we get

\[ f_2(n) = \sum_{0 \le i \le n} \binom{n}{i}^2. \]



The sequence $f_2(n)$ is sequence A000984 in Sloane's On-line Encyclopedia of Integer
Sequences [7].

There is a nice combinatorial proof that this sum is actually $\binom{2n}{n}$. Consider a string of
length $2n$, and choose $n$ positions in it. If a chosen position falls in the first half of the string, make
it 1; if a chosen position falls in the last half of the string, make it 2. Of the remaining unchosen
positions, make them 2 if they fall in the first half and 1 if they fall in the last half. If $i$ of the
chosen positions lie in the first half, then both halves contain exactly $i$ 1's, and it
is easy to see that this gives a bijection with the set of abelian squares. Thus we obtain

\[ f_2(n) = \binom{2n}{n}. \]
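
The bijection just described can be checked exhaustively for small $n$. The following sketch uses hypothetical names (`square_from_positions`, `check_bijection`) and 0-based positions, which are conventions chosen here for convenience.

```python
from itertools import combinations, product
from math import comb

def square_from_positions(chosen, n):
    """Build the string over {1, 2} assigned to an n-subset of positions 0..2n-1."""
    chosen = set(chosen)
    word = []
    for p in range(2 * n):
        in_first_half = p < n
        if p in chosen:
            word.append(1 if in_first_half else 2)
        else:
            word.append(2 if in_first_half else 1)
    return tuple(word)

def check_bijection(n):
    """Compare the image of all n-subsets with the set of binary abelian squares."""
    images = {square_from_positions(c, n) for c in combinations(range(2 * n), n)}
    squares = {w for w in product((1, 2), repeat=2 * n)
               if sorted(w[:n]) == sorted(w[n:])}
    # Equal sets give surjectivity; an image of size C(2n, n) gives injectivity.
    return images == squares and len(images) == comb(2 * n, n)

print([check_bijection(n) for n in range(1, 6)])
```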

We can now use this idea to evaluate $f_k(n)$ in terms of $f_{k-1}(n)$. Choose the positions of
the 1's in the first and last halves of the string; if there are $i$ of them in each half, this can be
done in $\binom{n}{i}^2$ ways. Now fill in
the remaining $2n - 2i$ positions with the other $k - 1$ symbols in $f_{k-1}(n - i)$ ways. Thus

\[ f_k(n) = \sum_{0 \le i \le n} \binom{n}{i}^2 f_{k-1}(n-i) = \sum_{0 \le i \le n} \binom{n}{i}^2 f_{k-1}(i). \]
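
A memoized version of this recurrence is easy to write down. The sketch below assumes the conventions $f_k(0) = 1$ (the empty string) and $f_1(n) = 1$ as base cases; the function name `f` is an illustrative choice.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def f(k, n):
    """Number of abelian squares of length 2n over a k-letter alphabet (k >= 1)."""
    if n == 0 or k == 1:
        return 1  # assumed base cases: the empty string, or the single string of 1's
    return sum(comb(n, i) ** 2 * f(k - 1, n - i) for i in range(n + 1))

for k in range(2, 5):
    print(k, [f(k, n) for n in range(1, 8)])
```

Each value costs $O(n)$ big-integer multiplications once the smaller values are cached, so whole rows of the table can be produced far beyond the reach of direct enumeration.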



For $k = 3$ this gives

\[ f_3(n) = \sum_{0 \le i \le n} \binom{n}{i}^2 \binom{2i}{i}. \]

The sequence $f_3(n)$ is sequence A002893 in Sloane's On-line Encyclopedia of Integer
Sequences.

More generally, we can write $f_{k_1+k_2}(n)$ in terms of $f_{k_1}(n)$ and $f_{k_2}(n)$. We have

\[ f_{k_1+k_2}(n) = \sum_{0 \le i \le n} \binom{n}{i}^2 f_{k_1}(i)\, f_{k_2}(n-i). \]

To see this, suppose the first $n$ symbols have $i$ occurrences of the symbols $1, 2, \ldots, k_1$. Note
that we can choose the positions where the symbols $1, \ldots, k_1$ will go in the first $n$ symbols
in $\binom{n}{i}$ ways, and where they will go in the last $n$ symbols in $\binom{n}{i}$ ways. Once the positions
are chosen, we can fill them in with $1, \ldots, k_1$ in $f_{k_1}(i)$ ways. The remaining positions can
be filled with the remaining symbols $k_1 + 1, k_1 + 2, \ldots, k_1 + k_2$ in $f_{k_2}(n - i)$ ways. Thus for
$k_1 = k_2 = 2$, we get

\[ f_4(n) = \sum_{0 \le i \le n} \binom{n}{i}^2 \binom{2i}{i} \binom{2n-2i}{n-i}. \]

The sequence $f_4(n)$ is sequence A002895 in Sloane's On-line Encyclopedia of Integer
Sequences.
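
The splitting formula can be cross-checked against the one-letter recurrence. In this sketch the names `f`, `f_split` and `f4_closed` are hypothetical, and $f_k(0) = 1$ is assumed as a convention.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def f(k, n):
    """f_k(n) via the recurrence that removes one letter at a time."""
    if n == 0 or k == 1:
        return 1
    return sum(comb(n, i) ** 2 * f(k - 1, n - i) for i in range(n + 1))

def f_split(k1, k2, n):
    """f_{k1+k2}(n) computed by splitting the alphabet into k1 and k2 letters."""
    return sum(comb(n, i) ** 2 * f(k1, i) * f(k2, n - i) for i in range(n + 1))

def f4_closed(n):
    """The case k1 = k2 = 2, written with central binomial coefficients."""
    return sum(comb(n, i) ** 2 * comb(2 * i, i) * comb(2 * (n - i), n - i)
               for i in range(n + 1))

print([(f(4, n), f_split(2, 2, n), f4_closed(n)) for n in range(6)])
```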

Yet another formula for $f_k(n)$ is

\[ f_k(n) = \sum_{n_1 + n_2 + \cdots + n_k = n} \binom{n}{n_1\, n_2\, \cdots\, n_k}^2, \]

which follows from choosing the signature of the first half of the string and then matching it in
the second. Here $n_i$ counts the number of occurrences of $i$, and $\binom{n}{n_1\, n_2\, \cdots\, n_k}$ is the multinomial
coefficient $\frac{n!}{n_1!\, n_2! \cdots n_k!}$. As we will see, this formula suffices to obtain the asymptotic behavior
of $f_k(n)$ as $n \to \infty$.
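
As an illustration, the sum over signatures can be evaluated directly for small parameters. The helpers `compositions`, `multinomial` and `f_multinomial` below are names introduced only for this sketch.

```python
from math import factorial

def compositions(n, k):
    """Yield every k-tuple of nonnegative integers with sum n (the possible signatures)."""
    if k == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, k - 1):
            yield (first,) + rest

def multinomial(parts):
    """Exact multinomial coefficient n! / (n_1! ... n_k!) with n = sum(parts)."""
    result = factorial(sum(parts))
    for p in parts:
        result //= factorial(p)  # each intermediate quotient is an integer
    return result

def f_multinomial(k, n):
    """f_k(n) as the sum of squared multinomial coefficients over all signatures."""
    return sum(multinomial(c) ** 2 for c in compositions(n, k))

print([f_multinomial(3, n) for n in range(1, 7)])
```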

3 Asymptotics 

In what follows we shamelessly apply the factorial function to noninteger arguments, using
the standard definition $x! = \Gamma(x + 1)$, where $\Gamma$ is the well-known gamma function.
First let's consider the asymptotics of

\[ \binom{n}{n_1\, n_2\, \cdots\, n_k}. \tag{1} \]

We use an idea that is due (more or less) to Lagrange [5]. The maximum of the multinomial
coefficient (1) occurs when $n_i = n/k$, so write $n_i = n/k + x_i \sqrt{n}$. Thus

\[ n = \sum_{1 \le i \le k} n_i = n + \sum_{1 \le i \le k} x_i \sqrt{n}, \]

and so $\sum_{1 \le i \le k} x_i = 0$.

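Stirling's formula states that

\[ n! = e^{n \log n - n} \sqrt{2\pi n}\, \bigl(1 + O(n^{-1})\bigr) \tag{2} \]

as $n \to \infty$.

Recall that $n_i = n/k + x_i \sqrt{n}$. Using Taylor's formula

\[ \log(1 + y) = y - \frac{y^2}{2} + O(y^3) \tag{3} \]

for $y = x_i k / \sqrt{n}$, we get

\[ \begin{aligned}
\log n_i &= \log\left(\frac{n}{k} + x_i \sqrt{n}\right) \\
&= \log\frac{n}{k} + \log\left(1 + \frac{x_i k}{\sqrt{n}}\right) \\
&= \log\frac{n}{k} + \frac{x_i k}{\sqrt{n}} - \frac{x_i^2 k^2}{2n} + O(x_i^3 n^{-3/2}).
\end{aligned} \]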



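Hence

\[ \begin{aligned}
n_i \log n_i &= \left(\frac{n}{k} + x_i \sqrt{n}\right) \left(\log\frac{n}{k} + \frac{x_i k}{\sqrt{n}} - \frac{x_i^2 k^2}{2n} + O(x_i^3 n^{-3/2})\right) \\
&= \left(\frac{n}{k} + x_i \sqrt{n}\right) \log\frac{n}{k} + \sqrt{n}\, x_i + \frac{1}{2} k x_i^2 + O(x_i^3 n^{-1/2}).
\end{aligned} \]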

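Thus,

\[ n_i \log n_i - n_i = \left(\frac{n}{k} + x_i \sqrt{n}\right) \log\frac{n}{k} + \frac{1}{2} k x_i^2 - \frac{n}{k} + O(x_i^3 n^{-1/2}) \tag{4} \]

and hence if $|x_i| < n^{\epsilon}$ for some $0 < \epsilon < 1/6$, we get

\[ \sum_{1 \le i \le k} (n_i \log n_i - n_i) = n \log\frac{n}{k} - n + \frac{1}{2} k \sum_{1 \le i \le k} x_i^2 + O(n^{-1/2 + 3\epsilon}), \tag{5} \]

where we have used the fact that $\sum_{1 \le i \le k} x_i = 0$.
Thus

\[ \prod_{1 \le i \le k} \left(\frac{n}{k} + x_i \sqrt{n}\right)! = \exp\left(n \log\frac{n}{k} - n + \frac{1}{2} k \sum_{1 \le i \le k} x_i^2 + O(n^{-1/2 + 3\epsilon})\right) \left(\frac{2\pi n}{k}\right)^{k/2}. \tag{6} \]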



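Hence for $|x_i| < n^{\epsilon}$ we get

\[ \binom{n}{n_1\, n_2\, \cdots\, n_k} = \frac{n!}{\prod_{1 \le i \le k} \left(\frac{n}{k} + x_i \sqrt{n}\right)!} \]

and hence

\[ \begin{aligned}
\binom{n}{n_1\, n_2\, \cdots\, n_k} &\sim \exp\left(n \log k - \frac{k}{2} \sum_{1 \le i \le k} x_i^2\right) (2\pi n)^{(1-k)/2} k^{k/2} \\
&= k^n \exp\left(-\frac{k}{2} \sum_{1 \le i \le k} x_i^2\right) (2\pi n)^{(1-k)/2} k^{k/2},
\end{aligned} \tag{7} \]

so that the square of the multinomial coefficient is asymptotic to
$k^{2n} \exp\bigl(-k \sum_{1 \le i \le k} x_i^2\bigr) (2\pi n)^{1-k} k^k$.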
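
To get a feeling for the quality of the estimate (7), one can compare it with exact multinomial coefficients near the centre $n_i = n/k$. The sketch below is only a numerical illustration; the names `multinomial` and `estimate_7` are hypothetical, and $n$ is kept small enough that all quantities fit in a floating-point number.

```python
from math import exp, factorial, pi, sqrt

def multinomial(parts):
    """Exact multinomial coefficient n! / (n_1! ... n_k!) with n = sum(parts)."""
    result = factorial(sum(parts))
    for p in parts:
        result //= factorial(p)
    return result

def estimate_7(parts):
    """Right-hand side of (7), with x_i = (n_i - n/k) / sqrt(n)."""
    n, k = sum(parts), len(parts)
    xs = [(p - n / k) / sqrt(n) for p in parts]
    gauss = exp(-(k / 2) * sum(x * x for x in xs))
    return k ** n * gauss * (2 * pi * n) ** ((1 - k) / 2) * k ** (k / 2)

for parts in [(100, 100, 100), (105, 98, 97), (200, 200, 200), (210, 195, 195)]:
    print(parts, multinomial(parts) / estimate_7(parts))
```

The printed ratios should be close to 1 and drift away from it as the $x_i$ grow, in line with the restriction $|x_i| < n^{\epsilon}$.
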
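Now let's approximate the sum

\[ \sum_{n_1 + n_2 + \cdots + n_k = n} \binom{n}{n_1\, n_2\, \cdots\, n_k}^2 \]

with the multiple integral

\[ \begin{aligned}
& k^{2n} (2\pi n)^{1-k} k^k \underbrace{\int\!\!\int \cdots \int}_{k-1} \exp\Bigl(-k \sum_{1 \le i \le k} x_i^2\Bigr)\, dn_1\, dn_2 \cdots dn_{k-1} \\
&\quad = k^{2n} (2\pi n)^{1-k} k^k n^{(k-1)/2} \underbrace{\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty}}_{k-1} \exp\Bigl(-k \Bigl(\sum_{1 \le i \le k-1} x_i^2 + \Bigl(\sum_{1 \le i \le k-1} x_i\Bigr)^2\Bigr)\Bigr)\, dx_1\, dx_2 \cdots dx_{k-1},
\end{aligned} \tag{8} \]

where we have used the fact that $dn_i = \sqrt{n}\, dx_i$ and $x_k = -x_1 - x_2 - \cdots - x_{k-1}$.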

Note that the integrand is guaranteed to be asymptotic to the quantity we want only if
$|x_i| < n^{\epsilon}$, but outside this region the integrand is exponentially small.

In order to evaluate the multiple integral (8), we need three lemmas.

Lemma 1. If $a > 0$, then

\[ \int_{-\infty}^{\infty} \exp\bigl(-(ax^2 + bx + c)\bigr)\, dx = \exp\left(\frac{b^2}{4a} - c\right) \pi^{1/2} a^{-1/2}. \]

Proof. This can essentially be found, for example, in [4, Eq. 3.323.2], but for completeness
we give the proof (also see [6]).

Complete the square, writing

\[ ax^2 + bx + c = a\left(x + \frac{b}{2a}\right)^2 + c - \frac{b^2}{4a}. \]

Make the substitution $u = x + \frac{b}{2a}$ to get

\[ \int_{-\infty}^{\infty} \exp\bigl(-(ax^2 + bx + c)\bigr)\, dx = \exp\left(\frac{b^2}{4a} - c\right) \int_{-\infty}^{\infty} \exp(-au^2)\, du. \]

Now make the substitution $v = a^{1/2} u$ to get

\[ \int_{-\infty}^{\infty} \exp(-au^2)\, du = a^{-1/2} \int_{-\infty}^{\infty} \exp(-v^2)\, dv. \]

The result now follows from the well-known evaluation $\int_{-\infty}^{\infty} \exp(-v^2)\, dv = \pi^{1/2}$. □
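
Lemma 1 is easy to sanity-check numerically. The sketch below compares the closed form with a plain midpoint rule on a window centred at the vertex $x = -b/(2a)$ of the quadratic. The function names and the window and step parameters are choices made for this illustration only.

```python
from math import exp, pi, sqrt

def gaussian_integral_closed(a, b, c):
    """Closed form from Lemma 1 (requires a > 0)."""
    return exp(b * b / (4 * a) - c) * sqrt(pi / a)

def gaussian_integral_numeric(a, b, c, half_width=12.0, steps=20_000):
    """Midpoint-rule approximation; the tails beyond the window are negligible."""
    center = -b / (2 * a)
    scale = 1 / sqrt(a)
    lo = center - half_width * scale
    h = 2 * half_width * scale / steps
    total = 0.0
    for j in range(steps):
        x = lo + (j + 0.5) * h
        total += exp(-(a * x * x + b * x + c))
    return total * h

print(gaussian_integral_closed(2.0, 3.0, 1.0), gaussian_integral_numeric(2.0, 3.0, 1.0))
```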

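Lemma 2. Let $S_{m,0} = \bigl(\sum_{1 \le i \le m} x_i^2\bigr) + \bigl(\sum_{1 \le i \le m} x_i\bigr)^2$, and for $1 \le l \le m$ define $S_{m,l}$ by

\[ \pi^{1/2} \left(\frac{l}{l+1}\right)^{1/2} \exp(-S_{m,l}) = \int_{-\infty}^{\infty} \exp(-S_{m,l-1})\, dx_l. \tag{9} \]

Then

\[ S_{m,l} = \frac{l+2}{l+1} \sum_{l+1 \le i \le m} x_i^2 + \frac{2}{l+1} \sum_{l+1 \le i < j \le m} x_i x_j. \]

Proof. By induction on $l$. Clearly the result is true for $l = 0$. Now apply Lemma 1 to the
integral over $x_{l+1}$, with
$a = \frac{l+2}{l+1}$, $b = \frac{2}{l+1} \sum_{l+2 \le j \le m} x_j$, and $c = \frac{l+2}{l+1} \sum_{l+2 \le j \le m} x_j^2 + \frac{2}{l+1} \sum_{l+2 \le i < j \le m} x_i x_j$. We now have

\[ \begin{aligned}
c - \frac{b^2}{4a} &= \frac{l+2}{l+1} \sum_{l+2 \le j \le m} x_j^2 + \frac{2}{l+1} \sum_{l+2 \le i < j \le m} x_i x_j - \frac{1}{(l+1)(l+2)} \Bigl(\sum_{l+2 \le j \le m} x_j^2 + 2 \sum_{l+2 \le i < j \le m} x_i x_j\Bigr) \\
&= \frac{(l+2)^2 - 1}{(l+1)(l+2)} \sum_{l+2 \le j \le m} x_j^2 + \frac{2(l+2) - 2}{(l+1)(l+2)} \sum_{l+2 \le i < j \le m} x_i x_j \\
&= \frac{l+3}{l+2} \sum_{l+2 \le j \le m} x_j^2 + \frac{2}{l+2} \sum_{l+2 \le i < j \le m} x_i x_j,
\end{aligned} \]

which is the claimed expression for $S_{m,l+1}$. □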



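Thus we get

Lemma 3.

\[ \underbrace{\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty}}_{m} \exp(-S_{m,0})\, dx_1\, dx_2 \cdots dx_m = \pi^{m/2} (m+1)^{-1/2}. \]

Proof. Apply Lemma 2 iteratively, obtaining

\[ \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \exp(-S_{m,0})\, dx_1\, dx_2 \cdots dx_m = \underbrace{\pi^{1/2} \left(\tfrac{1}{2}\right)^{1/2} \pi^{1/2} \left(\tfrac{2}{3}\right)^{1/2} \cdots\, \pi^{1/2} \left(\tfrac{m}{m+1}\right)^{1/2}}_{m} = \pi^{m/2} (m+1)^{-1/2}, \]

where we have used telescoping cancellation. □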
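
The telescoping product and the closed form of Lemma 3 can be compared numerically, and for $m = 2$ the double integral itself can be approximated on a grid. This is only a sketch: the function names, the integration window and the grid size are assumptions of the illustration.

```python
from math import exp, isclose, pi, sqrt

def lemma3_closed(m):
    """Closed form pi^(m/2) (m+1)^(-1/2)."""
    return pi ** (m / 2) / sqrt(m + 1)

def lemma3_telescoped(m):
    """Product of the factors pi^(1/2) (l/(l+1))^(1/2) for l = 1, ..., m."""
    value = 1.0
    for l in range(1, m + 1):
        value *= sqrt(pi) * sqrt(l / (l + 1))
    return value

def lemma3_numeric_m2(half_width=8.0, steps=400):
    """Midpoint rule for the integral of exp(-(x^2 + y^2 + (x + y)^2)) over the plane."""
    h = 2 * half_width / steps
    points = [-half_width + (j + 0.5) * h for j in range(steps)]
    total = 0.0
    for x in points:
        for y in points:
            total += exp(-(x * x + y * y + (x + y) ** 2))
    return total * h * h

print(lemma3_closed(2), lemma3_telescoped(2), lemma3_numeric_m2())
```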

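It now follows (by a change of variables) that

\[ \underbrace{\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty}}_{k-1} \exp(-k S_{k-1,0})\, dx_1\, dx_2 \cdots dx_{k-1} = \pi^{(k-1)/2} k^{-k/2} \tag{10} \]

and so

\[ \sum_{n_1 + n_2 + \cdots + n_k = n} \binom{n}{n_1\, n_2\, \cdots\, n_k}^2 \sim k^{2n + k/2} (4\pi n)^{(1-k)/2}. \]

We have proved

Theorem 4. Let $k$ be an integer $\ge 2$. Then, as $n \to \infty$, we have

\[ f_k(n) \sim k^{2n + k/2} (4\pi n)^{(1-k)/2}. \]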
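
The estimate of Theorem 4 can be compared with exact values. The sketch below computes $f_k(n)$ with the one-letter recurrence from Section 2, assuming $f_k(0) = 1$, and prints the ratio to the asymptotic expression. The names `f` and `asymptotic` are illustrative.

```python
from functools import lru_cache
from math import comb, pi

@lru_cache(maxsize=None)
def f(k, n):
    """Exact number of abelian squares of length 2n over k letters."""
    if n == 0 or k == 1:
        return 1
    return sum(comb(n, i) ** 2 * f(k - 1, n - i) for i in range(n + 1))

def asymptotic(k, n):
    """The estimate k^(2n + k/2) (4 pi n)^((1 - k)/2) from Theorem 4."""
    return k ** (2 * n + k / 2) * (4 * pi * n) ** ((1 - k) / 2)

for k in (2, 3, 4):
    print(k, [round(f(k, n) / asymptotic(k, n), 4) for n in (5, 20, 100)])
```

For each fixed $k$ the ratios should approach 1 as $n$ grows. For $k = 2$ the estimate reduces to the familiar $4^n / \sqrt{\pi n}$ for the central binomial coefficient.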



4 Remark 

Our original motivation for estimating the number of abelian squares of length 2n over an 
alphabet of size $k$ was an attempt to use the Lovász local lemma [2, Chap. 5] to prove the
existence of an infinite word avoiding abelian squares. However, since by Theorem 4 the
chance that a randomly chosen string of length $2n$ is an abelian square is asymptotically

\[ f_k(n)/k^{2n} \sim k^{k/2} (4\pi n)^{(1-k)/2} = \Omega(n^{(1-k)/2}), \]

this approach seems unlikely to work.